
Experimental GTFN Executor Caching #1197

Merged
merged 8 commits into GridTools:main on Mar 17, 2023

Conversation


@tehrengruber (Contributor) commented Mar 6, 2023

A small caching mechanism that makes repeated execution of stencils with the GTFN backend much faster. We should integrate this into the OTF pipeline in a better way, but this makes life much easier for now. I have left the caching strategy as SESSION for now, but for people who know what they are doing, running with cache.Strategy.PERSISTENT can make executing all tests a breeze (time-wise :-)).

The caching of the GTFN executor can now be configured:

```python
use_caching: bool = False
caching_strategy = cache.Strategy.SESSION
```
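A hypothetical usage sketch (the import paths and constructor shape are assumptions inferred from this thread, not the verified gt4py API):

```python
# Hypothetical sketch: enabling the experimental cache on a GTFN executor.
# Import paths and constructor arguments are assumed from this PR thread,
# not verified against the final gt4py API.
from gt4py.next.otf.compilation import cache  # assumed location of cache.Strategy
from gt4py.next.program_processors.runners import gtfn_cpu

executor = gtfn_cpu.GTFNExecutor(
    use_caching=True,                            # defaults to False
    caching_strategy=cache.Strategy.PERSISTENT,  # SESSION is the default
)
```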

This feature is experimental!

Extracted from #965

@DropD (Contributor) left a comment:

Since there are no tests that would allow us to say with any confidence that this caching is free of false positives (stale cache hits), there should at least be a way to switch it off. Off should probably then be the default on CI, if not the default overall.

Even if it looks obvious now that it does the right thing, nothing is in place to ensure we would notice if it breaks down the line.

@tehrengruber (Contributor, Author) replied:

I fully agree. Do you have a proposal for how to disable the caching without too much effort? We could make the caching an option in the GTFNTranslator, similar to use_imperative_backend, and make it default to False. If we also make the cache.Strategy a configurable option, this would fulfill all the requirements with respect to caching that I currently have. In the long term we should streamline the mechanism for passing backend options from the frontend, but for the time being this would be sufficient, I believe.

@DropD (Contributor) commented Mar 7, 2023

> I fully agree. Do you have a proposal for how to disable the caching without too much effort? We could make the caching an option in the GTFNTranslator, similar to use_imperative_backend, and make it default to False. If we also make the cache.Strategy a configurable option, this would fulfill all the requirements with respect to caching that I currently have. In the long term we should streamline the mechanism for passing backend options from the frontend, but for the time being this would be sufficient, I believe.

Let's go with the options in GTFNExecutor for now. However, we seem to be accumulating them already, so we shouldn't wait too long for a workflow-based solution.

@tehrengruber changed the title from "Poor man's cache for gtfn backend" to "Experimental GTFN Executor Caching" on Mar 8, 2023
@tehrengruber requested a review from DropD on March 8, 2023 at 15:21


```python
@dataclasses.dataclass(frozen=True)
class CachedWorkflow(Generic[StartT, EndT]):
```
A contributor commented:

Why is this not a Step subclass? It lives on the same level, has the same interoperability, etc.

Either way, the current naming uses Workflow for the concept and XyzStep for all the concrete implementations (any workflow can be a step in a super-workflow anyway). So naming this concrete implementation CachedWorkflow makes it potentially confusing.
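As a rough illustration of that convention (the names here are illustrative, not gt4py's actual definitions):

```python
# Illustrative only: "Workflow" names the concept, "...Step" names
# concrete implementations, per the convention described above.
from typing import Protocol, TypeVar

StartT = TypeVar("StartT")
EndT = TypeVar("EndT")

class Workflow(Protocol[StartT, EndT]):
    """The concept: any callable transforming a StartT into an EndT."""

    def __call__(self, inp: StartT) -> EndT:
        ...

class CachedStep:
    """Per the convention, a concrete implementation would be a '...Step'."""
```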

src/gt4py/next/otf/workflow.py (thread outdated, resolved)
Comment on lines 83 to 84:

```python
# TODO(tehrengruber): as the frontend types contain lists they are
# not hashable. As a workaround we just use content_hash here.
```
A contributor commented:

If this comment only explains why content_hash is used here, it does not need to be a TODO. Otherwise it should state something actionable, like "make frontend types hashable" or at least "verify this is the best solution".
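For context, a self-contained stand-in showing the idea behind the workaround (gt4py's real content_hash utility differs; this only illustrates keying a cache on unhashable inputs):

```python
import hashlib

def content_hash(obj: object) -> str:
    # Stand-in for the content_hash mentioned in the TODO: derive a stable
    # key from the object's textual content, since frontend types that
    # contain lists are not hashable themselves.
    return hashlib.sha256(repr(obj).encode()).hexdigest()

# A list is unhashable, but its content hash works as a dict key:
cache_key = content_hash([1, 2, 3])
```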

src/gt4py/next/program_processors/runners/gtfn_cpu.py (thread outdated, resolved)
@tehrengruber requested a review from DropD on March 11, 2023 at 16:48
@DropD (Contributor) commented Mar 14, 2023

Looks like the doctests have gone out of sync, but once they are fixed, it's ready to merge.

Comment on lines 167 to 169:

```python
if hash_ in self._cache:
    return self._cache[hash_]
return self._cache.setdefault(hash_, self.workflow(inp))
```
@tehrengruber (Contributor, Author) commented Mar 17, 2023:

Suggested change:

```diff
-if hash_ in self._cache:
-    return self._cache[hash_]
-return self._cache.setdefault(hash_, self.workflow(inp))
+try:
+    result = self._cache[hash_]
+except KeyError:
+    result = self._cache[hash_] = self.workflow(inp)
+return result
```

suggested by @egparedes
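For context, a self-contained sketch of the memoization pattern with the accepted suggestion (the class name and structure are illustrative, not necessarily what was merged). The try/except form performs a single dictionary lookup on a cache hit and computes the wrapped workflow only on a miss, whereas the original paired a membership test with a second lookup:

```python
import dataclasses
from typing import Any, Callable, Generic, TypeVar

StartT = TypeVar("StartT")
EndT = TypeVar("EndT")

@dataclasses.dataclass(frozen=True)
class CachedStep(Generic[StartT, EndT]):
    # Wraps a workflow step and memoizes its results per input hash.
    workflow: Callable[[StartT], EndT]
    hash_function: Callable[[StartT], Any] = hash
    _cache: dict = dataclasses.field(default_factory=dict, init=False)

    def __call__(self, inp: StartT) -> EndT:
        hash_ = self.hash_function(inp)
        try:
            result = self._cache[hash_]  # single lookup on a cache hit
        except KeyError:
            # computed only on a miss, then stored and returned
            result = self._cache[hash_] = self.workflow(inp)
        return result

# Usage:
doubler = CachedStep(workflow=lambda x: x * 2)
assert doubler(21) == 42  # computed and cached
assert doubler(21) == 42  # served from the cache
```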

@tehrengruber merged commit b802910 into GridTools:main on Mar 17, 2023